DETAILED ACTION
Restriction / Election

1. Applicant's election with traverse of Group I, claims 1-14 and 20-28, and SEQ ID
NO: 1 only, in the reply filed on 3/16/07 is acknowledged. The traversal is on the
ground(s) that "...not seen as ...undue burden." This is not found persuasive because,
while a search of the prior art for one group may overlap with that of another group,
the searches are not co-extensive and thus would represent an undue burden on Office
resources.

2. Claims 15-16, 18-19, 29-33 and sequences other than SEQ ID NO: 1 are
withdrawn from further consideration pursuant to 37 CFR 1.142(b), as being drawn to
nonelected inventions, there being no allowable generic or linking claim. Applicant
timely traversed the restriction (election) requirement in the reply filed on 3/16/07.

3. Claims 1-14 and 20-28 are examined in the instant application.

4. The requirement is deemed proper and is therefore made FINAL. 

Sequence Listing 

5. Applicant's computer readable format sequence listing has been entered. 

Drawings 

6. The drawings are acceptable for examination. 



Claim Objections

7. Claim 1 is objected to because of the following informalities: the claim recites
non-elected SEQ ID NOs. Appropriate correction is required. 

8. Claims 9 and 21 are objected to because of the following informalities: the claims
recite "the ... host organism ... is ... a plant cell." Plant cells are not organisms.
Appropriate correction is required.

9. Claim 1 is objected to because of the following informalities: the claim recites 
an improper Markush group drawn to 2 members of a genus and a separate genus. 
Appropriate correction is required.

Claim Rejections - 35 U.S.C. §112, second paragraph 

The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and 
distinctly claiming the subject matter which the applicant regards as his invention. 

10. Claims 1-14 and 20-28 are rejected under 35 U.S.C. §112, second paragraph, as
being indefinite for failing to particularly point out and distinctly claim the subject matter 
which the applicant regards as the invention. 

11. In Claims 1-14 and 20-28, it is unclear what is being retained in the derived 
product. 

Claim Rejections - 35 U.S.C. §112, first paragraph, written description

The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall
set forth the best mode contemplated by the inventor of carrying out his invention. 

A description of a genus of cDNAs may be achieved by means of a 
recitation of a representative number of cDNAs, defined by nucleotide 
sequence, falling within the scope of the genus or of a recitation of 
structural features common to the members of the genus, which features 
constitute a substantial portion of the genus. University of California v. Eli 
Lilly and Co., 43 USPQ2d 1398 (Fed. Cir. 1997). (emphasis added). 

See also Fiddes v. Baird, 30 USPQ2d 1481 (Bd. Pat. App. & Int.
1993); In re Curtis, 69 USPQ2d 1274, 1282 (Fed. Cir. 2004); Fiers v.
Revel, 25 USPQ2d 1601 (Fed. Cir. 1993); Amgen Inc. v. Chugai
Pharmaceutical Co. Ltd., 18 USPQ2d 1016 (Fed. Cir. 1991).

12. Claims 1-14 and 20-28 are rejected under 35 U.S.C. 112, first paragraph, as 
failing to comply with the written description requirement. The claim(s) contains subject 
matter which was not described in the specification in such a way as to reasonably 
convey to one skilled in the relevant art that the inventor(s), at the time the application 
was filed, had possession of the claimed invention. 

The claims are broadly drawn to an inducible promoter, SEQ ID NO: 1 (NIMIN-1 
promoter), biologically active derivatives of SEQ ID NO: 1 of any length and sequence 
from any source, "chemicals" and "organic compounds" capable of inducing SEQ ID 
NO: 1, and undescribed heterologous nucleic acid sequences encoding wild-type or 
mutant NIMIN-1 promoters. Said sequences include genes encoding promoters from 
any source (Claim 1-14, 20-28), as well as any sequence from any source encoding any 
promoter which has any biological activity of any kind, including irreversible self-excision 
from the genome.

In contrast, Applicant has only described SEQ ID NO: 1 (sequence listing), GUS
expression data with NIMIN-1 (SEQ ID NO: 1)-GUS fusions (Examples 2-3), and
transgenic tobacco plants containing SEQ ID NO: 1-GUS constructs (Examples 2-3).

Applicant does not describe biologically active derivatives of SEQ ID NO: 1 of 
any length and sequence from any source, any "chemicals" (e.g., mercury or water) or 
any "organic compounds" (e.g., cyanide or glyphosate) capable of inducing SEQ ID NO: 
1, and undescribed heterologous nucleic acid sequences encoding wild-type or mutant 
SEQ ID NO: 1 promoters. Said sequences include genes encoding promoters from any 
source (Claim 1-14, 20-28), as well as any sequence from any source encoding any 
promoter with any or no promoter activity. 

Applicant has not described the structure or any other relevant characteristics for
all nucleic acid sequences encoding biologically active derivatives of NIMIN-1
promoters, or a representative number of same, and a literature review does not indicate
that they are well known to one of skill in the art. Applicant has only described nucleic
acid sequences encoding SEQ ID NO: 1 from Arabidopsis.

Applicant fails to describe a representative number of biologically active 
derivatives of SEQ ID NO: 1, or chemicals or organic compounds capable of inducing 
said biologically active derivatives. Applicant only describes SEQ ID NO: 1. 
Furthermore, Applicant fails to describe structural features common to members of the
claimed genus of biologically active derivatives of SEQ ID NO: 1. Hence, Applicant
fails to meet either prong of the two-prong test set forth by Eli Lilly. Furthermore, given
the lack of description of the necessary elements essential for NIMIN-1 promoter
activity, it remains unclear what features identify NIMIN-1 promoters. Since the genus
of NIMIN-1 promoters or the genus of biologically active fragments thereof has not been
described by either specific structural features or a representative number of species,
the specification fails to provide an adequate written description to support the breadth of
the claims.

The Federal Circuit has clarified the application of the written description 
requirement. The court stated that a written description of an invention "requires a 
precise definition, such as by structure, formula, [or] chemical name, of the claimed 
subject matter sufficient to distinguish it from other materials." University of California v. 
Eli Lilly and Co., 119 F.3d 1559, 1568; 43 USPQ2d 1398, 1406 (Fed. Cir. 1997); see
also Fiddes v. Baird, 30 USPQ2d 1481 (Bd. Pat. App. & Int. 1993). The court also
concluded that "naming a type of material generally known to exist, in the absence of 
knowledge as to what that material consists of, is not a description of that material." Eli 
Lilly. Further, the court held that to adequately describe a claimed genus, Patent Owner 
must describe a representative number of the species of the claimed genus, and that 
one of skill in the art should be able to "visualize or recognize the identity of the 
members of the genus." Id. 

Finally, the court held: 

A description of a genus of cDNAs may be achieved by means of a 
recitation of a representative number of cDNAs, defined by nucleotide 
sequence, falling within the scope of the genus or a recitation of structural 
features common to members of the genus, which features constitute a 
substantial portion of the genus. Id.

See also MPEP Section 2163, page 174 of Chapter 2100 of the August 2005
version, column 1, bottom paragraph, where it is taught that

[T]he claimed invention as a whole may not be adequately 
described where an invention is described solely in terms of a method of 
its making coupled with its function and there is no described or art- 
recognized correlation or relationship between the structure of the 
invention and its function. A biomolecule sequence described only by a 
functional characteristic, without any known or disclosed correlation 
between that function and the structure of the sequence, normally is not a 
sufficient identifying characteristic for written description purposes, even 
when accompanied by a method of obtaining the claimed sequence. 

See also Amgen Inc. v. Chugai Pharmaceutical Co. Ltd., 18 USPQ2d 1016, 1021
(Fed. Cir. 1991), where it is taught that a gene is not reduced to practice until the
inventor can define it by "its physical or chemical properties" (e.g., a DNA sequence).

Given the claim breadth and lack of guidance as discussed above, the 
specification fails to provide an adequate written description of the genus of sequences 
as broadly claimed. Given the lack of written description of the claimed genus of 
sequences, any method of using them, such as transforming plant cells and plants 
therewith, and the resultant products including the claimed transformed plant cells and 
plants containing the genus of sequences, would also be inadequately described. 
Accordingly, one skilled in the art would not have recognized Applicant to have been in 
possession of the claimed invention at the time of filing. See The Written Description 
Requirement guidelines published in Federal Register/ Vol. 66, No. 4/ Friday January 5, 
2001/ Notices: pp. 1099-1111. 



Application/Control Number: 10/566,201
Art Unit: 1638

Claim Rejections - 35 U.S.C. §112, first paragraph, enablement

The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 



13. Claims 1-14 and 20-28 are rejected under 35 U.S.C. 112, first paragraph,
because the specification, while being enabling for SEQ ID NO: 1, does not reasonably
provide enablement for biologically active derivatives of SEQ ID NO: 1. The
specification does not enable any person skilled in the art to which it pertains, or with
which it is most nearly connected, to make and use the invention commensurate in
scope with these claims.

The Wands court set forth the enablement balancing test: 

Factors to be considered in determining whether a disclosure
meets the enablement requirement of 35 USC 112, first paragraph, have
been described by the court in In re Wands, 858 F.2d 731, 8 USPQ2d
1400, 1404 (Fed. Cir. 1988). Wands states at page 1404, "Factors to be
considered in determining whether a disclosure would require undue
experimentation have been summarized by the board in Ex parte Forman.
They include (1) the quantity of experimentation necessary, (2) the
amount of direction or guidance presented, (3) the presence or absence of
working examples, (4) the nature of the invention, (5) the state of the prior
art, (6) the relative skill of those in the art, (7) the predictability or
unpredictability of the art, and (8) the breadth of the claims."

M.P.E.P. § 2164.01(a); see also Ex parte Forman, 230 USPQ 546, 547 (Bd. Pat. App. & Int.
1986); see also Enzo Biochem, Inc. v. Calgene, Inc., 188 F.3d 1362, 52 USPQ2d 1129
(Fed. Cir. 1999).

The claims are broadly drawn to an inducible promoter, SEQ ID NO: 1 (NIMIN-1 
promoter), biologically active derivatives of SEQ ID NO: 1 of any length and sequence
from any source, "chemicals" and "organic compounds" capable of inducing SEQ ID
NO: 1, and undescribed heterologous nucleic acid sequences encoding wild-type or
mutant NIMIN-1 promoters. Said sequences include genes encoding promoters from
any source (Claim 1-14, 20-28), as well as any sequence from any source encoding any
promoter which has any biological activity of any kind, including irreversible self-excision
from the genome.

In contrast, Applicant teaches only SEQ ID NO: 1 (sequence listing), GUS
expression data with NIMIN-1 (SEQ ID NO: 1)-GUS fusions (Examples 2-3), and
transgenic plants containing SEQ ID NO: 1-GUS constructs (Examples 2-3).

Applicant does not teach biologically active derivatives of SEQ ID NO: 1 of any
length and sequence from any source, "chemicals" (e.g., mercury or water) or "organic 
compounds" (e.g., cyanide or glyphosate) capable of inducing SEQ ID NO: 1, and 
undescribed heterologous nucleic acid sequences encoding wild-type or mutant SEQ ID 
NO: 1 promoters. Said sequences include genes encoding promoters from any source 
(Claim 1-14, 20-28), as well as any sequence from any source encoding any promoter 
which somehow "modulates" expression or activity of a gene of interest. 

The Breadth Of The Claims 

See above. 

The Unpredictability of the Art and the State of the Prior Art 
The state-of-the-art is such that one of skill in the art cannot predict which
"biologically active derivatives of SEQ ID NO: 1" will work a priori. Reviews by Kim,
Hannenhalli, Maiti, and Doelling detail a variety of problems seen in promoter
identification and modification.

The state of the prior art, as exemplified by Kim et al. (Plant Molecular Biology,
vol. 24, pp. 105-117, 1994), teaches the extreme sensitivity of promoter regions to
single base pair changes, the absolute requirement for as few as 3 to 6 nucleotides for
promoter function, and the failure of a promoter to function either constitutively or
specifically when lacking oligonucleotide regions approximately 100 bp upstream of the
transcription start site (page 106, paragraph bridging the columns; paragraph bridging
pages 107 and 108; page 110, paragraph bridging the columns). In addition, the
claimed nucleic acid sequences that are biologically active derivatives of SEQ ID NO: 1
would comprise non-functional transcriptional and translational elements, i.e.,
modifications of CAAT, TATA and the ATG codon, required for proper initiation of these
cellular activities, known in the prior art, as well as highly conserved promoter regions
rendered inactive by modifications. In addition, Applicant has not shown that any
biologically active derivative of SEQ ID NO: 1 can also have the desired promoter
activity.

A recent review by Hannenhalli ((2001) Bioinformatics 17: S90-S96) teaches that
prediction of eukaryotic promoters has been one of the most elusive problems despite
considerable effort devoted to the study (see at least the abstract). In the instant case
of a promoter, Hannenhalli's review teaches a 50% failure in the sensitivity of promoter 
detection (p. S90, last full sentence).

Twenty base-pair long regions of a DNA fragment that has promoter activity
cannot predictably be assumed to also have promoter activity. Deletion analyses of
various promoters have shown that even DNA segments from the portion of a promoter
region containing sequence elements thought to be most important (e.g., the TATA-box) 
need to be longer than 20 basepairs. Maiti et al, in studies on a figwort mosaic virus 
promoter, found that the smallest portion upstream of the transcriptional start site that 
would support transcription was 198 basepairs long; segments of 73 and 37 basepairs 
did not work (1997, Transgen. Res., 6:143-156, see Fig. 4). Doelling et al found that the 
minimal rRNA promoter of Arabidopsis thaliana is at least 33 nucleotides long (1995, 
Plant J. 8:683-692, see Fig. 1). 

Guidance in the Specification 

See above. The specification, while suggesting the use of SEQ ID NO: 1,
does not provide significant guidance on how to overcome art-recognized problems in
achieving expression from smaller "biologically active derivatives of SEQ ID NO:
1," including single base pair polynucleotide promoters, while still retaining
activity.

In addition, since the working examples disclosed in the specification are limited
to unmodified SEQ ID NO: 1, the chemically inducible activity of the sequence cannot be
extrapolated to any derivatives thereof, absent specific guidance. While Applicant is not
required to exemplify each and every claimed embodiment, specific guidance is required
as to which regions of the disclosed sequence can be modified or truncated so that the
chemically inducible promoter activity is retained.

The specification has no working examples of sequences other than SEQ ID NO: 
1 and no working examples of any "biologically active derivatives of SEQ ID NO: 1". 

Without sufficient guidance, identification of NIMIN-1 promoters is unpredictable.
Without guidance on how to overcome the problems seen in determining a priori
which sequences will have NIMIN-1 promoter activity in transgenic plants, the
experimentation left to those skilled in the art is unnecessarily and improperly
extensive and undue.

Therefore, given the breadth of the claims; the lack of guidance and working 
examples; the unpredictability in the art; and the state-of-the-art as discussed above, 
undue trial and error experimentation would be required to practice the claimed 
invention, and therefore the invention is not enabled throughout the broad scope of the 
claims. 

Claim Rejections - 35 U.S.C. §102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 

form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

35 U.S.C. §102.

14. Claims 1-5 are rejected under 35 U.S.C. 102(b) as being anticipated by
Sato, S. (2000), Structural analysis of Arabidopsis thaliana chromosome 3. I. Sequence
features of the regions of 4,504,864 bp covered by sixty P1 and TAC clones. DNA
Res. 7 (2), 131-135 (See Appendix A). Sato discloses a recombinant nucleic acid
containing at least a first nucleotide sequence operably linked to at least a second
nucleotide sequence containing a transgene to be expressed, wherein the first
nucleotide sequence contains a regulatory sequence selected from the group consisting
of SEQ-ID-No. 1, and a biologically active derivative thereof, wherein the regulatory
sequence is a promoter sequence selectively inducible by chemicals, wherein the
chemicals are selected from the group consisting of organic compounds, wherein the
organic compounds are selected from the group consisting of phenolic compounds,
thiamine, benzoic acid, isonicotinic acid (INA), and derivatives thereof, wherein the
phenolic compound is salicylic acid or a structural or functional derivative thereof,
wherein the expression/transcription of said nucleotide sequence results in a detectable
signal, a vector containing the recombinant nucleic acid, and a bacterial host
cell (Id. at p. 131, left column, 1st paragraph). Because Sato discloses cloning bugs
with cloning vectors, Sato discloses the limitations of the claimed invention. Thus, the
reference discloses all the limitations of the claimed invention.

15. Claims 1-5 are rejected under 35 U.S.C. 102(b) as being anticipated by
Federspiel et al. (Database EMBL, 7 February 2000, Federspiel: Arabidopsis thaliana
chromosome I BAC T14P4 genomic sequence, complete sequence. Database
accession no. AC022521, See Appendix B). Federspiel discloses a recombinant nucleic acid
containing at least a first nucleotide sequence operably linked to at least a second
nucleotide sequence containing a transgene to be expressed, wherein the first
nucleotide sequence contains a regulatory sequence selected from the group consisting
of SEQ-ID-No. 1, and a biologically active derivative thereof, wherein the regulatory
sequence is a promoter sequence selectively inducible by chemicals, wherein the
chemicals are selected from the group consisting of organic compounds, wherein the
organic compounds are selected from the group consisting of phenolic compounds,
thiamine, benzoic acid, isonicotinic acid (INA), and derivatives thereof, wherein the
phenolic compound is salicylic acid or a structural or functional derivative thereof. Thus,
the reference discloses all the limitations of the claimed invention.

16. Claims 1-14 and 20-28 are rejected under 35 U.S.C. 102(b) as being
anticipated by Lebel et al., which discloses chemically inducible genes inducible by salicylic
acid and uses thereof (Lebel et al., WO 98/03536 A1, 1998, SEQ ID NO: 1, page 5,
paragraph 2, lines 2-6).

Lebel discloses a recombinant nucleic acid containing at least a first nucleotide 
sequence operably linked (See claim 5) to at least a second nucleotide sequence 
containing a transgene to be expressed, wherein the first nucleotide sequence contains 
a regulatory sequence selected from the group consisting of a biologically active
derivative of SEQ-ID-No. 1 (See sequence listing), wherein the regulatory sequence is a
promoter sequence selectively inducible by chemicals (See claim 1(a-c)), wherein the
chemicals are selected from the group consisting of organic compounds (See page 2,
lines 9, 21-25), wherein the organic compounds are selected from the group consisting of
phenolic compounds, thiamine, benzoic acid, isonicotinic acid (INA), and derivatives
thereof (See abstract, page 2, line 9, 21-25, page 10, line 6-26), wherein the phenolic 
compound is salicylic acid (See abstract) or a structural or functional derivative thereof 
(See abstract, page 2, lines 9, 21-25; page 10, lines 6-26), wherein the 
expression/transcription of said nucleotide sequence results in a detectable signal 
(claim 8; page 9, lines 18-26), a vector (claim 10) containing the recombinant nucleic 
acid, a host organism (claim 10), a bacterial or plant cell (claim 10), a transgenic plant 
wherein the recombinant nucleic acid is stably integrated into the genetic material (claim 
10), wherein the transgene contained in the second nucleotide sequence is transiently 
expressed (See page 13, lines 9-18; note that the expression is measured in minutes), 
and wherein the expression of the transgene contained in the second nucleotide 
sequence is selectively induced upon treatment with chemicals (See page 14, lines 20- 
25). 

Biologically active fragment is interpreted to encompass a single nucleotide base 
pair. 

Thus the reference anticipates the claimed invention.

17. No claim is allowed.

Any inquiry concerning this communication or earlier communications from the examiner should be directed 
to Brendan O. Baggot whose telephone number is 571/272-5265. The examiner can normally be reached on 
Monday - Friday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Anne Marie 
Grunberg can be reached on 571/272-0975. The fax phone number for the organization where this application or 
proceeding is assigned is 571-273-8300.

Result us-10-566-201-2.rge

GenCore version 5.1.9
Copyright (c) 1993 - 2006 Biocceleration Ltd.

OM nucleic - nucleic search, using sw model

Run on:          September 28, 2006, 01:51:24 ; Search time 11528 Seconds
                 (without alignments)
                 6800.793 Million cell updates/sec

Title:           US-10-566-201-2
Perfect score:   1226
Sequence:        1 gatctctatgtatataaaaa......ttgactaagcttaaacgacg 1226

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        6366136 seqs, 31973710525 residues

Total number of hits satisfying chosen parameters:    12732272

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        GenEmbl:*
                  1: gb_env:*
                  2: gb_pat:*
                  3: gb_ph:*
                  4: gb_pl:*
                  5: gb_pr:*
                  6: gb_ro:*
                  7: gb_sts:*
                  8: gb_sy:*
                  9: gb_un:*
                 10: gb_vi:*
                 11: gb_ov:*
                 12: gb_htg:*
                 13: gb_in:*
                 14: gb_om:*
                 15: gb_ba:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
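
The search header reports a Smith-Waterman ("sw") local alignment model with a gap-open penalty of 10.0 and a gap-extension penalty of 1.0. The following is a minimal sketch of that kind of scoring, not the GenCore implementation. It assumes a match score of 1 (consistent with a perfect score of 1226 for a 1226-base query), a hypothetical mismatch score of -1, and a gap of length k costing gap_open + k * gap_extend. The actual values in the IDENTITY_NUC table are not given in this document, and the fractional scores in the summaries suggest they differ from these assumptions.

```python
def smith_waterman_score(query, target, match=1.0, mismatch=-1.0,
                         gap_open=10.0, gap_extend=1.0):
    """Best local alignment score with affine gaps (Gotoh recurrences).

    All scoring values are illustrative assumptions, not GenCore's.
    A gap of length k is charged gap_open + k * gap_extend.
    """
    query, target = query.upper(), target.upper()
    neg_inf = float("-inf")
    cols = len(target) + 1
    h_prev = [0.0] * cols       # best score ending at each cell, previous row
    f_prev = [neg_inf] * cols   # best score ending in a vertical gap, previous row
    best = 0.0
    for q_base in query:
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf             # best score ending in a horizontal gap, this row
        for j in range(1, cols):
            # Open a new gap from the cell to the left, or extend one.
            e = max(h_cur[j - 1] - gap_open - gap_extend, e - gap_extend)
            # Open a new gap from the cell above, or extend one.
            f_cur[j] = max(h_prev[j] - gap_open - gap_extend,
                           f_prev[j] - gap_extend)
            sub = match if q_base == target[j - 1] else mismatch
            # Local alignment: never let the score drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

Under these assumptions a sequence aligned against itself scores its own length, which is how a 1226-base query would reach a perfect score of 1226.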

SUMMARIES

                   %
Result           Query
  No.    Score   Match       Length           DB  ID           Description
--------------------------------------------------------------------------------
    1    1226  100.0         1226            2  CS007929     CS007929 Sequence
    2    1226  100.0         1226            2  CS025770     CS025770 Sequence
c   3    1226  100.0        83650            4  AB023041     AB023041 Arabidops
    4     941   76.8        92620            4  AB026636     AB026636 Arabidops
    5     837   68.3         1700            2  AR488147     AR488147 Sequence
c   6   797.8   65.1        83646            4  AB005248     AB005248 Arabidops
    7   797.6   65.1        94487            4  AC012394     AC012394 Arabidops
c   8   797.6   65.1       100806            4  AC015450     AC015450 Arabidops
c   9   739.2   60.3       104386            4  ATT32A17     AL161813 Arabidops
c  10   739.2   60.3       179771            4  ATCHRIV25    AL161513 Arabidops
   11     691   56.4        95519            4  AF071527     AF071527 Arabidops
c  12     691   56.4       116448            4  AC005142     AC005142 Arabidops
c  13     691   56.4  [illegible]            4  ATCHRIV9     AL161497 Arabidops
c  14   553.2   45.1  [illegible]            4  [illegible]  AC007203 Arabidops
   15   231.8   18.9       105223            4  AC007399     AC007399 Arabidops
c  16   114.6    9.3  [illegible]            2  [illegible]  AX344555 Sequence
   17     112    9.1  [illegible]  [illegible]  [illegible]  AC117342 Rattus no
   18   111.2    9.1  [illegible]  [illegible]  [illegible]  CS083843 Sequence
   19   111.2    9.1  [illegible]            5  [illegible]  AL592166 Human DNA
c  20   108.8    8.9  [illegible]  [illegible]  [illegible]  AF128395 Arabidops
c  21   108.8    8.9  [illegible]  [illegible]  [illegible]  AL161507 Arabidops
   22   107.6    8.8         1524            2  CS083838     CS083838 Sequence
   23   107.6    8.8       212999           12  AC151201     AC151201 Bos tauru
c  24   107.4    8.8       170627           12  AC125567     AC125567 Rattus no
c  25   106.4    8.7        15548            2  AX347057     AX347057 Sequence
   26   105.8    8.6        47403            2  AX059535     AX059535 Sequence
   27   105.8    8.6        91470            4  T4B21        AF118223 Arabidops
   28   105.8    8.6       110000            2  AR777056_04  Continuation (5 of
   29   105.8    8.6       200001            4  ATCHRIV13    AL161501 Arabidops
   30   105.6    8.6       154563            5  CR936360     CR936360 Human DNA
c  31   104.4    8.5         2131            2  CS083950     CS083950 Sequence
c  32   104.4    8.5       172816            5  AC093899     AC093899 Homo sapi
   33   104.2    8.5       241619           12  AC167751     AC167751 Bos tauru
c  34   103.8    8.5       166501           12  CR548626     CR548626 Danio rer
c  35   103.2    8.4         7218            2  166494       166494 Sequence 14
c  36   101.6    8.3       143331            5  AC091214     AC091214 Homo sapi
c  37   100.8    8.2       169510           12  CR855864     CR855864 Danio rer
c  38   100.8    8.2       178273           12  AC005308     AC005308 Plasmodiu
   39   100.8    8.2       250531           13  AE014845     AE014845 Plasmodiu
   40   100.6    8.2       155942           12  AC136691     AC136691 Homo sapi
   41   100.6    8.2       161765            5  AC113190     AC113190 Homo sapi
   42   100.6    8.2       176479           12  AC135631     AC135631 Homo sapi
c  43   100.2    8.2       192265            5  CNS018P3     AL110118 Human chr
   44     100    8.2       154604           11  AL954739     AL954739 Zebrafish
   45    99.8    8.1        67970           13  PFMAL1P3     AL031746 Plasmodiu
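
The "% Query Match" values in the summaries appear consistent with the hit score divided by the perfect score of 1226 reported in the search header, rounded to one decimal place. The sketch below checks that reading against several legible rows. The function name is hypothetical, and the formula is inferred from the listed numbers rather than stated anywhere in the report.

```python
def percent_query_match(score, perfect_score):
    """Hit score as a percentage of the query's perfect score, to one decimal.

    Inferred relationship; the search report does not define the column.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 1226  # "Perfect score" from the search header

# (score, reported % query match) pairs read from the summaries
legible_rows = [(1226, 100.0), (941, 76.8), (837, 68.3), (797.8, 65.1), (99.8, 8.1)]
for score, reported in legible_rows:
    assert percent_query_match(score, PERFECT_SCORE) == reported
```

On this reading, Result 3 (AB023041, the Sato clone) matches the full query at 100%, while most hits after Result 15 fall below 10%.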



ALIGNMENTS 



RESULT 3 

AB023041/C 

LOCUS 

DEFINITION 

ACCESSION 

VERSION 

KEYWORDS 

SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 



JOURNAL 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



COMMENT 



FEATURES 

source 



AB023041 83650 bp DNA linear PLN 14-FEB-2004 

Arabidopsis thaliana genomic DNA, chromosome 3, PI clone: MPE11. 
AB023041 BA000014 
AB023041.1 GI:4220640 

Arabidopsis thaliana (thale cress) 
Arabidopsis thaliana 

Eukaryota; Viridiplantae ; Streptophyta ; Embryophyta; Tracheophyta ; 
Spermatophyta; Magnoliophyta; eudicotyledons ; core eudicotyledons; 
rosids; eurosids II; Brassicales; Brassicaceae ; Arabidopsis. 
1 

Sato , S . , Nakamur a , Y . , Kaneko,T., Katoh,T., Asamizu,E. and Tabata,S. 

Structural analysis of Arabidopsis thaliana chromosome 3. I. 

Sequence features of the regions of 4,504,864 bp covered by sixty 

PI and TAC clones 

DNA Res. 7 (2), 131-135 (2000) 

10819329 

2 (bases 1 to 83650) 

Sato,S., Nakamura,Y., Kaneko,T., Kato,T., Asamizu,E. and Tabata,S. 
Direct Submission 

Submitted (01-FEB-1999) Yasukazu Nakamura, Kazusa DNA Research 
Institute, Department of Plant Gene Research; 1532-3, Yana, 
Kisarazu, Chiba 292-0812, Japan (E-mail : ynakamu@kazusa . or . jp, 
Tel: 81-438-52-3 93 5, Fax : 81 -438 - 52-3 934 ) 
Address for correspondence: kaos@kazusa.or.jp 

For the latest information on annotation of this clone, please see 
http : //www, kazusa . or . jp/kaos/cgi-bin/agd_graph. cgi?c=MPEll 
Genes with similarity to proteins in the databases are described in 
•product 1 or 'note* qualifiers. Genes that have no significant 
protein similarity are described as 'unknown protein' . 
The software programs used to predict genes include: Grail 
(Informatics Group, Oak Ridge National Laboratory, 
http : //compbio . ornl . gov/Grail-1 . 3/) , 

GENSCAN (Chris Burge, MIT, http://CCR-081.mit.edu/GENSCAN.html), 
NetGene2 (S.M. Hebsgaard, et al . , CBS, Technical University of 
Denmark, http://www.cbs.dtu.dk/services/NetGene2/) and 
SplicePredictor (Volker Brendel, Stanford University, 
http: //gremlinl . zool . iastate . edu/cgi-bin/sp . cgi) . 
Genes encoding tRNAs are predicted by tRNAscan-SE 
(Sean Eddy, Washington University School of Medicine, St. Louis, 
http: //genome . wustl . edu/eddy/tRNAscan-SE/) . 

This sequence may not be the entire insert of this clone. It may be 
shorter because we remove overlaps between neighboring submissions. 
The 5' clone is K9I22 and the 3' clone is MJL14.

Location/Qualifiers

source 1..83650
/organism="Arabidopsis thaliana"
/mol_type="genomic DNA"
/db_xref="taxon:3702"
/chromosome="3"
/clone="MPE11"
/clone_lib="Mitsui P1"
/ecotype="Columbia"
exon complement(542..764)
/inference="non-experimental evidence, no additional
details recorded"
/note="CDS is reported in Acc# AP000599
contains similarity to CHP-rich zinc finger protein
gene_id:K9I22.5"
/number=1

CDS join(1705..2463,2548..2739,2862..2949,3037..3107,
3190..3273,3410..3514)
/inference="non-experimental evidence, no additional
details recorded"
/note="gb|AAD55139.1
gene_id:MPE11.1"
/codon_start=1
/product="dihydrolipoamide S-acetyltransferase"
/protein_id="BAB01047.1"
/db_xref="GI:9279589"
/translation="MTVRSKIREIFMPALSSTMTEGKIVSWIKTEGEKLAKGESWW
ESDKADMDVETFYDGYLAAIWGEGETAPVGAAIGLLAETEAEIEEAKS KAASKSSSS 
VAEAWPSPPPVTSSPAPAIAQPAPVTAVSDGPRKTVATPYAKKLAKQHKVDIESVAG 
TGPFGRITASDVETAAGIAPSKSSIAPPPPPPPPVTAKATTTNLPPLLPDSSIVPFTA 
MQSAVSKNMIESLSVPTFRVGYPVNTDALDALYEKVKPKGVTMTALLAKAAGMALAQH 
PWNASCKDGKSFSYNSSINIAVAVAINGGLITPVLQDADKLDLYLLSQKWKELVGKA 
RSKQLQPHEYNSGTFTLSNLGMFGVDRFDAILPPGQGAIMAVGASKPTWADKDGFFS 
VKNTMLVNVTADHRIVYGADLAAFLQTFAKIIENPDSLTL" 
CDS complement(4594..5106)
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gene_id:MPE11.2
unknown protein"
/codon_start=1
/protein_id="BAB01048.1"
/db_xref="GI:9279590"
/translation="MEDLLEERLTRTDSVGNKRVRDGLDLDSPDVKRLRDDLFDDSGL
DPVSQDLDSVMKSFENELSTTTAALSSGETQPDLGYLFEASDDELGLPPPLTPPQTLL
PPSCEETVTELVRASSDSSEVGELCGFEDHVTEFGPCDLGDDGLFEYFDGCLDSGDLF
SWRPEFLPAE"

CDS complement(join(10090..10408,10649..10804,10916..10962))
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gene_id:MPE11.3
similar to unknown protein
sp|P42744"
/codon_start=1
/protein_id="BAB01049.1"
/db_xref="GI:9279591"
/translation="MMEPKAKYDRQLMYTIQGTLEEASICLLNCGPIGSNALKNLVLG
GVGSITIVEGSKVLIGDIWKQFHRHAIEQKFSISEGFRDENNTVFQRREQHSVFQRQL
EQNRIAGQTVRPMDIARRIWASIGRMWSLADRTTTYGKEARAITDPPFGRVLARLTSL
VDRHLDFRRLCDI"
CDS complement(11450..11818)
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gene_id:MPE11.4
unknown protein"
/codon_start=1
/protein_id="BAB01050.1"
/db_xref="GI:9279592"
/translation="MNNSLKKEERVEEDNGKSDGNRGKPSTEWRTVTEEEVDEFFKI
LRRVHVATRTVAKVNGGVAEGELPSKKRKRSQNLGLRNSLDCNGVRDGEFDEINRVGL
QGLGLDLNCKPEPDSVSLSL"
CDS 16981..17979
/inference="non-experimental evidence, no additional
details recorded"
/note="gene_id:MPE11.5"
/codon_start=1
/product="AP2 domain transcription factor-like protein"
/protein_id="BAB01051.1"
/db_xref="GI:9279593"
/translation="MAERKKRSSIQTNKPNKKPMKKKPFQLNHLPGLSEDLKTMRKLR
FWNDPYATDYSSSEEEERSQRRKRYVCEIDLPFAQAATQAESESSYCQESNNNGVSK
TKISACSKKVLRSKASPWGRSSTTVSKPVGVRQRKWGKWAAEIRHPITKVRTWLGTY
ETLEQAADAYATKKLEFDALAAATSAASSVLSNESGSMISASGSSIDLDKKLVDSTLD
QQAGESKKASFDFDFADLQIPEMGCFIDDSFIPNACELDFLLTEENNNQMLDDYCGID
DLDIIGLECDGPSELPDYDFSDVEIDLGLIGTTIDKYAFVDHIATTTPTPLNIACP"
CDS join(21893..22063,22330..22386,22578..22783,22894..23227,
23311..23382,23473..23613)
/inference="non-experimental evidence, no additional
details recorded"
/note="gb|AAF23821.1
gene_id:MPE11.6"
/codon_start=1
/product="homocysteine S-methyltransferase AtHMT-1"
/protein_id="BAB01052.1"
/db_xref="GI:9279594"
/translation="MVLEKKSALLEDLIKKCGGCAWDGGFATQLEIHGAAINDPLWS
AVSLIKNPELIKRVHMEYLEAGADIWTSSYQATIPGFLSRGLSIEESESLLQKSVEL 
AVEARDRFWEKVS KVSGHSYNRALVAASIGSYGAYLADGSEYSGHYGENVSLDKLKDF 
HRRRLQVLVEAGPDLIjAFETIPNKLEAQACVELLEEEICVQIPAWICFTSVDGEKAPSG 
ESFEECLEPLNKSNNIYAVGINCAPPQFIENLIRKFAKLTKKAIWYPNSGEVWDGKA 
KQWLPSQCFGDDEFEMFATKWRDLGAKLIGGCCRTTPSTINAISRDLKRR" 
CDS complement(27277..27552)
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gene_id:MPE11.7
unknown protein"
/codon_start=1
/protein_id="BAB01053.1"
/db_xref="GI:9279595"
/translation="MTHAREWRSSLTTTLLMVILLSYMLHLFCVYSRVGAIRIFPETP
ASGKRQEEDLMKKYFGAGKFPPVDSFVGKGISESKRIVPSCPDPLHN" 
CDS 30765..31883
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gb|AAF16548.1
gene_id:MPE11.8
similar to unknown protein"
/codon_start=1
/protein_id="BAB01054.1"
/db_xref="GI:9279596"
/translation="MPKERKERSVSLDKYKRSPLCCEASLALKPSEKQVKEWEEARCP
VCMEHPHNGILLICSSYENGCRPYMCDTSHRHSNCFDQFRKASKEKPSLSLLREEEES
NEPTEMEDVDSDSTAVNLLGEAASEITWDLSDGERGEEEVEEEEEEVWEEEEEGIV 
TTEEDQEKNKPQKLTCPLCRGHIKEWVWKAARCFMNSKHRSCSCETCDFSGSYSDLR 
KHARLLHPGVRPSEADPERQRSWRRLERQSDLGDLLSTLQSSFGGDEISNDDGFLFAD 
TRLLTVYFLIRVFRPESSGSRSSSWSGTSRARTHTSGRRRSSRPASLWGESYEGNTGT 
SPRDEENNQSSDEQVSGTRRRRSRRRTVIDDDDEEEEP" 
CDS complement(join(32197..32430,32525..32676,32969..33332,
33404..33487))
/inference="non-experimental evidence, no additional
details recorded"
/note="gene_id:MPE11.9"
/codon_start=1
/product="chloroplast 50S ribosomal protein L15"
/protein_id="BAB01055.1"
/db_xref="GI:9279597"
/translation="MATPLSISSNPLTSRHCYRLHLSSTSFKGNVSVLGANPSQILSL
KLNQTLKTRNQQQFARPLVWSQTAATSSAWAPERFRLDNLGPQPGSRKKQKRKGRG 
ISAGQGAS CGFGMRGQKSRSGPGIMRGFEGGQTALYRRLPKLRGIAGGMRSGLPKYLP 
VNIKDIETAGFQEGDEVSLETLKQKGLINPSGRERKLPLKILGTGELSMKLTFKARAF 
STQAKEKLEASGCTLTVLPGRKKWVKPSVAKNQARADEYFAKKRAAAAEAATSEPAAS 
A" 

CDS complement(join(33961..34021,34114..34226,34391..34482,
34602..34800))
/inference="non-experimental evidence, no additional
details recorded"
/note="unnamed protein product; gb|AAF26483.1
gene_id:MPE11.10
similar to unknown protein"
/codon_start=1
/protein_id="BAB01056.1"
/db_xref="GI:9279598"
/translation="MKNVMLIIDESNASYDLLIWALENQKDTIESSKVYIFAKQPQNS
FTPPTVLSSSVGFAQIFYPFSPNSELIRLAQEKNMKIALGILEKAKKICLNHGIKAET 
FTDDGDPKDLIRKIIQERNINLIVTSDQQSLKKCTQNTDCSLLWKKRLRKD" 
CDS join(36033..36150,36243..36283,36497..36565,36651..36707,
36797..36871)
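The feature locations in the record above use the GenBank `join(...)` and `complement(...)` operators. As a minimal sketch, assuming only simple `start..end` spans (no partial-end markers or single-base positions), the spliced length of a feature can be computed as follows. The function names `parse_location` and `spliced_length` are hypothetical helpers introduced for illustration; they are not part of any GenBank tool.

```python
import re

def parse_location(location):
    """Parse a simple GenBank location string.

    Returns (strand, spans), where strand is -1 for complement(...)
    and +1 otherwise, and spans is a list of (start, end) tuples.
    Assumption: only plain start..end spans are present.
    """
    text = location.replace(" ", "")  # tolerate stray spaces
    strand = -1 if text.startswith("complement(") else 1
    spans = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", text)]
    return strand, spans

def spliced_length(location):
    """Total number of bases covered by all spans (coordinates are inclusive)."""
    _, spans = parse_location(location)
    return sum(end - start + 1 for start, end in spans)

# Location of the MPE11.3 CDS as listed in the record above.
loc = "complement(join(10090..10408,10649..10804,10916..10962))"
print(parse_location(loc)[0])  # -1 (complement strand)
print(spliced_length(loc))     # 522 bases, i.e. 174 codons
```

A spliced length that is a multiple of three is consistent with a complete coding region that starts at `/codon_start=1`.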

Query Match 100.0%; Score 1226; DB 4; Length 83650; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1226; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 GATCTCTATGTATATAAAAATATGGGTAATATTAGAAACTAACTATGAAATGGAAAAGAA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      13069 GATCTCTATGTATATAAAAATATGGGTAATATTAGAAACTAACTATGAAATGGAAAAGAA 13010

Qy         61 TTGAGAGAATGACATTGTGTCAGAAAAGTTAGGTAAATAACATTTTCTGAAAAAGAGAAA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      13009 TTGAGAGAATGACATTGTGTCAGAAAAGTTAGGTAAATAACATTTTCTGAAAAAGAGAAA 12950

Qy        121 ATACAAAAATATCCTTGTGTTTACTTATTTTTACAATAATGCCATTGGCTTTAGTTATAA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12949 ATACAAAAATATCCTTGTGTTTACTTATTTTTACAATAATGCCATTGGCTTTAGTTATAA 12890

Qy        181 AGTTTATATGTATTTGTCTAAAATAGCATGATATATTTACAAAAATCATGCAATTCTTTA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12889 AGTTTATATGTATTTGTCTAAAATAGCATGATATATTTACAAAAATCATGCAATTCTTTA 12830

Qy        241 AAATACATACAGAATATATATACACGATATATATGTTTCTCTGAAATAATGTGTTTCTCA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12829 AAATACATACAGAATATATATACACGATATATATGTTTCTCTGAAATAATGTGTTTCTCA 12770

Qy        301 GAAATAGCACGAAATATTTATAAAAAGCATGCAATTCTCTTATAGATCGCGAAGTTTAAA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12769 GAAATAGCACGAAATATTTATAAAAAGCATGCAATTCTCTTATAGATCGCGAAGTTTAAA 12710

Qy        361 AAAACATATAGAATTGTTACAATATTACATGGGTTTTTATTGGATAACATGACAAATATT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12709 AAAACATATAGAATTGTTACAATATTACATGGGTTTTTATTGGATAACATGACAAATATT 12650

Qy        421 TATTTATTTCATGAGTTTTTATTGGATAGCATGACAAATATTAATATATCAGTGTTAATA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12649 TATTTATTTCATGAGTTTTTATTGGATAGCATGACAAATATTAATATATCAGTGTTAATA 12590

Qy        481 ACATGTTTTGTTCTTAAAATACATGCATTTTAAAATCAGACATTTGTTTTAAAATCAAAT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12589 ACATGTTTTGTTCTTAAAATACATGCATTTTAAAATCAGACATTTGTTTTAAAATCAAAT 12530

Qy        541 CTAATCTCTTATATCACAACGACATTGACGGAAAATTCAGGTAAAAAGAGAAAATAAAGA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12529 CTAATCTCTTATATCACAACGACATTGACGGAAAATTCAGGTAAAAAGAGAAAATAAAGA 12470

Qy        601 ATGAGAGATAGAGAGATTTCTATGGAAAAAGAAAGAGAGAACATGTAGGTGAACAAAATA 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12469 ATGAGAGATAGAGAGATTTCTATGGAAAAAGAAAGAGAGAACATGTAGGTGAACAAAATA 12410

Qy        661 AAGAGATATGATGATATATTTTATGAGAGGTGGTGAAGATTATTTTAGGAGAGGGAGAGA 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12409 AAGAGATATGATGATATATTTTATGAGAGGTGGTGAAGATTATTTTAGGAGAGGGAGAGA 12350

Qy        721 GAAATAGAAAAAGAAAATGACATGGTGAATCTGAAGAAGATGAATTGTGTTAAAGATGAA 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12349 GAAATAGAAAAAGAAAATGACATGGTGAATCTGAAGAAGATGAATTGTGTTAAAGATGAA 12290

Qy        781 GAGAGAAAGAGAACTCCATGGCTAAAGTCTCGTAAAGAAGATGAAAAAGAAACAAAAGAA 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12289 GAGAGAAAGAGAACTCCATGGCTAAAGTCTCGTAAAGAAGATGAAAAAGAAACAAAAGAA 12230

Qy        841 GGAAGAAGAAAGAGAAAGGCTAAAATAGACTAACTATTGCCAAAATTTCTGTAGCCGACA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12229 GGAAGAAGAAAGAGAAAGGCTAAAATAGACTAACTATTGCCAAAATTTCTGTAGCCGACA 12170

Qy        901 AATACTATTTGGTCCAAGGTTATTTTGTGTATTCTTTTGAAGTCAAAAGTTATTTCTTAC 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12169 AATACTATTTGGTCCAAGGTTATTTTGTGTATTCTTTTGAAGTCAAAAGTTATTTCTTAC 12110

Qy        961 ATATACTCTAAAAATATAGCCGATACCAATTTTTCCACACATGGACTTCCTTTATTCCAA 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12109 ATATACTCTAAAAATATAGCCGATACCAATTTTTCCACACATGGACTTCCTTTATTCCAA 12050

Qy       1021 AAGTCAATAAAGTGTGACGTCATGATACTTACGCTTTAAAACATCGCATGATGATGTCAT 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      12049 AAGTCAATAAAGTGTGACGTCATGATACTTACGCTTTAAAACATCGCATGATGATGTCAT 11990

Qy       1081 TAGCATCAATCTCCACCGTCCAATTTATTTAGTTGTTGACAATATCGACCGTCTAAGTTC 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      11989 TAGCATCAATCTCCACCGTCCAATTTATTTAGTTGTTGACAATATCGACCGTCTAAGTTC 11930

Qy       1141 CACACCGACGGCTATAAGAGTTTCATTATAAATTTTAGCAAAATAAAATCAGCAAATAAT 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      11929 CACACCGACGGCTATAAGAGTTTCATTATAAATTTTAGCAAAATAAAATCAGCAAATAAT 11870

Qy       1201 TTTTTCTTGACTAAGCTTAAACGACG 1226
              ||||||||||||||||||||||||||
Db      11869 TTTTTCTTGACTAAGCTTAAACGACG 11844
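In the alignment above, the database coordinates decrease as the query coordinates increase, which indicates a match on the complementary strand of the database entry. As a minimal sketch, assuming an ungapped alignment (the summary reports 0 indels and 0 gaps), a query position can be mapped to its database coordinate as follows. The function name `db_position` and its default arguments are illustrative assumptions taken from the first and last alignment lines.

```python
def db_position(query_pos, query_start=1, db_start=13069, complement=True):
    """Map a 1-based query position to a database coordinate.

    Assumption: the alignment is ungapped, so the offset from the
    start of the alignment is the same in query and database.
    On a complement-strand hit the database coordinate counts down.
    """
    offset = query_pos - query_start
    return db_start - offset if complement else db_start + offset

print(db_position(1))     # 13069, first aligned base
print(db_position(60))    # 13010, end of the first alignment line
print(db_position(1226))  # 11844, last aligned base
```

The values printed for positions 60 and 1226 agree with the right-hand coordinates shown on the first and last `Db` lines.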
APPENDIX B

SCORE Search Results Details for Application 10566201 and Search Result us-10-566-201-1.rge.




This page gives you Search Results detail for the Application 10566201 and Search
Result us-10-566-201-1.rge.






GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         October 19, 2006, 02:01:17 ; Search time 6423 Seconds
                                             (without alignments)
                                             10284.553 Million cell updates/sec

Title:          US-10-566-201-1
Perfect score:  1033
Sequence:       1 gaattcgtggtatagcgtta..........aaatcaatcactttctctaa 1033

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       6366136 seqs, 31973710525 residues

Total number of hits satisfying chosen parameters:      12732272

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      GenEmbl:*
                1:  gb_env:*
                2:  gb_pat:*
                3:  gb_ph:*
                4:  gb_pl:*
                5:  gb_pr:*
                6:  gb_ro:*
                7:  gb_sts:*
                8:  gb_sy:*
                9:  gb_un:*
                10: gb_vi:*
                11: gb_ov:*
                12: gb_htg:*
                13: gb_in:*
                14: gb_om:*
                15: gb_ba:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
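The search header above names a Smith-Waterman ("sw") model with the IDENTITY_NUC scoring table, Gapop 10.0 and Gapext 1.0. The following is a minimal sketch of local alignment scoring with affine gaps (the Gotoh recurrence), not the GenCore implementation. Two points are assumptions: the mismatch score of -1.0 (the report only implies a match score of 1, since a 1033-base query has a perfect score of 1033), and the convention that a gap of length k costs `gap_open + k * gap_extend`.

```python
def smith_waterman(query, target, match=1.0, mismatch=-1.0,
                   gap_open=10.0, gap_extend=1.0):
    """Return the best local alignment score with affine gap penalties.

    Assumptions (not confirmed by the search report): mismatch = -1.0,
    and a gap of length k costs gap_open + k * gap_extend.
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best scores ending in the previous row
    f_prev = [neg_inf] * (n + 1)  # previous-row scores ending in a vertical gap
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # score ending in a horizontal gap in this row
        for j in range(1, n + 1):
            # Open a new gap or extend an existing one.
            e = max(h_cur[j - 1] - gap_open - gap_extend, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open - gap_extend,
                           f_prev[j] - gap_extend)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best

# A query found intact inside a longer target scores its own length.
print(smith_waterman("GAATTCGTGG", "TTTGAATTCGTGGAAA"))  # 10.0
```

Under these assumptions, a query that occurs unchanged in a database entry scores its own length, which matches the score of 1033 reported for the 1033-base query.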

SUMMARIES

                     %
Result             Query
  No.    Score     Match        Length  DB  ID           Description

   1      1033   100.0          1033   2  CS007928     CS007928 Sequence
   2      1033   100.0          1033   2  CS025769     CS025769 Sequence
c  3      1033   100.0        109180   4  AC064879     AC064879 Arabidops
   4      1033   100.0        121668   4  AC022521     AC022521 Arabidops
   5     144.4    14.0         98461   4  ATT6H20      AL096859 Arabidops
   6     142.6    13.8        100906   4  ATF24G16     AL138647 Arabidops
   7     142.4    13.8         89219   4  ATT6K22      AL031187 Arabidops
   8     142.4    13.8        198715   4  ATCHRIV54    AL161554 Arabidops
c  9     140.6    13.6         99254   4  AC002423     AC002423 Genomic s
  10     140.4    13.6   [illegible]   4  AC008047     AC008047 Genomic s
  11     140.2    13.6         78181   4  AB011477     AB011477 Arabidops
  12     139.6    13.5        102299   4  AC018908     AC018908 Arabidops
  13     137.2    13.3         84544   4  AB005239     AB005239 Arabidops
  14     135.8    13.1         84544   4  AB005239     AB005239 Arabidops
  15       135    13.1         84872   4  AC006593     AC006593 Arabidops
c 16     134.8    13.0        100815   4  [illegible]  AL133314 Arabidops
  17     133.8    13.0   [illegible]   4  [illegible]  AL161537 Arabidops
  18     133.8    13.0   [illegible]   4  ATFCA0       Z97335 Arabidopsis
  19     133.2    12.9          1383   2  AX328877     AX328877 Sequence
  20     133.2    12.9   [illegible]   2  AX339608     AX339608 Sequence
  21     133.2    12.9   [illegible]   4  AB019236     AB019236 Arabidops
c 22     133.2    12.9   [illegible]   4  [illegible]  AL391734 Arabidops
c 23     130.4    12.6   [illegible]   4  [illegible]  AC007069 Arabidops
c 24     130.4    12.6   [illegible]   4  AB028607     AB028607 Arabidops
c 25     130.4    12.6   [illegible]   4  AC008262     AC008262 Genomic s
c 26     128.6    12.4         55686   4  AC007267     AC007267 Arabidops
  27     128.6    12.4         84413   4  AC069325     AC069325 Genomic S
  28     128.2    12.4         89811   4  AC018849     AC018849 Arabidops
c 29     128.2    12.4        132990   4  AC018848     AC018848 Arabidops
  30     127.4    12.3         62420   4  AB025622     AB025622 Arabidops
  31     127.2    12.3         92612   4  AC003974     AC003974 Arabidops
c 32       127    12.3        115851   4  AC002505     AC002505 Arabidops
c 33       126    12.2         93212   4  AC005936     AC005936 Arabidops
c 34     125.8    12.2         43570   4  AB019231     AB019231 Arabidops
  35     125.8    12.2         95295   4  AC020889     AC020889 Genomic s
c 36     125.8    12.2        102078   4  F11A17       AC007932 Arabidops
  37     125.6    12.2         86536   4  AP002047     AP002047 Arabidops
c 38     125.6    12.2        105768   4  AC069474     AC069474 Arabidops
c 39     124.8    12.1        100328   4  ATF18L15     AL133298 Arabidops
c 40     124.4    12.0         92652   4  AC024261     AC024261 Arabidops
  41       124    12.0         75948   4  AC037424     AC037424 Arabidops
  42       124    12.0         76976   4  AC006532     AC006532 Arabidops
  43     123.8    12.0          2004   2  AX461295     AX461295 Sequence
  44     123.8    12.0         88149   4  AC006201     AC006201 Arabidops
  45     123.8    12.0        103889   4  ATT24C20     AL096856 Arabidops
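The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (1033 for this query). That relationship is an inference from the legible rows, not a documented definition. A minimal sketch, with the hypothetical helper name `query_match_percent`:

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the perfect score, rounded to one decimal.

    Assumption: this reproduces the '% Query Match' column; it is
    inferred from the legible rows of the summary table.
    """
    return round(100.0 * score / perfect_score, 1)

PERFECT_SCORE = 1033  # "Perfect score" from the search header

for score in (1033, 144.4, 142.6, 123.8):
    print(score, query_match_percent(score, PERFECT_SCORE))
# 1033 100.0
# 144.4 14.0
# 142.6 13.8
# 123.8 12.0
```

The same check can be used to sanity-test a score or percentage that is hard to read in the scanned table.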
ALIGNMENTS



RESULT 3
AC064879/C
LOCUS       AC064879 109180 bp DNA linear PLN 19-AUG-2000
DEFINITION  Arabidopsis thaliana chromosome I BAC T6A9 genomic sequence,
            complete sequence.
ACCESSION   AC064879
VERSION     AC064879.3 GI:7958959
KEYWORDS    HTG.
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1 (bases 1 to 109180)
  AUTHORS   Federspiel,N.A., Palm,C.J., Conway,A.B., Conn,L., Hansen,N.F.,
            Altafi,H., Nguyen,M., Lam,B., Southwick,A., Miranda,M., Brooks,S.,
            Buehler,E., Chao,Q., Chin,C., Chiou,J., Choi,E., Gonzalez,A.,
            Howng,B., Johnson-Hopson,C., Khan,S., Kim,C., Koo,T., Lee,J.M.,
            Lenz,C., Liu,A., Liu,S., Mukharsky,N., Pham,P., Sakano,H.,
            Shinn,P., Toriumi,M., Vaysberg,M., Yu,G., Ecker,J., Theologis,A.
            and Davis,R.W.
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 109180)
  AUTHORS   Federspiel,N.A., Palm,C.J., Conway,A.B., Conn,L., Hansen,N.F.,
            Altafi,H., Nguyen,M., Lam,B., Southwick,A., Bei,Q., Buehler,E.,
            Chin,C., Chiou,J., Choi,E., Dunn,P., Gonzalez,A., Howng,B., Kim,C.,
            Koo,T., Lee,J.M., Lenz,C., Li,J., Liu,A., Liu,K., Liu,S.,
            Mukharsky,N., Pham,P., Sakano,H., Schwartz,J., Shinn,P.,
            Thaveri,A., Toriumi,M., Vaysberg,M., Walker,M., Yu,G., Ecker,J.,
            Theologis,A. and Davis,R.W.
  TITLE     Direct Submission
  JOURNAL   Submitted (24-APR-2000) DNA Sequencing and Technology Center,
            Stanford University, 855 California Avenue, Palo Alto, CA 94304,
            USA
REFERENCE   3 (bases 1 to 109180)
  AUTHORS   Federspiel,N.A., Palm,C.J., Conway,A.B., Conn,L., Hansen,N.F.,
            Altafi,H., Nguyen,M., Lam,B., Southwick,A., Bei,Q., Buehler,E.,
            Chin,C., Chiou,J., Choi,E., Dunn,P., Gonzalez,A., Howng,B., Kim,C.,
            Koo,T., Lee,J.M., Lenz,C., Li,J., Liu,A., Liu,K., Liu,S.,
            Mukharsky,N., Pham,P., Sakano,H., Schwartz,J., Shinn,P.,
            Thaveri,A., Toriumi,M., Vaysberg,M., Walker,M., Yu,G., Ecker,J.,
            Theologis,A. and Davis,R.W.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-MAY-2000) DNA Sequencing and Technology Center,
            Stanford University, 855 California Avenue, Palo Alto, CA 94304,
            USA
REFERENCE   4 (bases 1 to 109180)
  AUTHORS   Federspiel,N.A., Palm,C.J., Conway,A.B., Conn,L., Hansen,N.F.,
            Altafi,H., Nguyen,M., Lam,B., Southwick,A., Ecker,J., Theologis,A.
            and Davis,R.W.
  TITLE     Direct Submission
  JOURNAL   Submitted (19-AUG-2000) DNA Sequencing and Technology Center,
            Stanford University, 855 California Avenue, Palo Alto, CA 94304,
            USA

COMMENT On May 20, 2000 this sequence version replaced gi:7922104.

Bases 1-51,564 of clone T6A9 overlap with bases 55,373-106,937 of
BAC clone T7I23 (gb|U89959) and bases 91,185-109,180 overlap with
103,674-121,668 of BAC clone T14P4 (gb|AC022521).
FEATURES Location/Qualifiers
source 1..109180
/organism="Arabidopsis thaliana"
/mol_type="genomic DNA"
/db_xref="taxon:3702"
/chromosome="I"
/clone="T6A9"
misc_feature 1..51564
/note="overlap with bases 55,373-106,937 of BAC clone
T7I23 (gb|U89959), see Genbank entry for BAC clone T7I23
for annotation in this region."
gene complement(53259..55668)
/gene="T6A9.1"

CDS complement(join(53259..53652,53748..53956,54037..54159,
54519..54698,55339..55668))
/gene="T6A9.1"
/note="Similar to mannan endo-1,4-beta-mannosidase"
/codon_start=1
/protein_id="AAG00883.1"
/db_xref="GI:9857528"
/translation="MLNILPFFLFFLPFLIGNNRICVAVKTGFVGRNGTQFVLNGEQV
YLNGFNAYWMMTTAADTASKGRATVTTALRQASAVGMNVARIWGFNEGDYIPLQISPG
SYSEDVFKGLDFWYEAGRFNIKLIISLVNNFEDYGGRKKYVEWAGLDEPDEFYTNSA 
VKQFYKNHVKTVLTRKNTITGRMYKDDPTIFSWELINEPRCNDSTASNILQDWVKEMA 
SYVKSIDSNHLLEIGLEGFYGESIPERTVYNPGGRVLTGTDFITNNQIPDIDFATIHI
YPDSWLPLQSSRTGEQDTFVDRWIGAHIEDCDNIIKKPLLITEFGKSSKYPGFSLEKR 
NKFFQRVYDVIYDSARAGGSCTGGVFWQLTTNRTGLLGDGYEVFMQAGPNTTAQLIAD 
QSSKLKNLKYPPLVTHSAE"
gene complement(57638..58537)
/gene="T6A9.2"
CDS complement(join(57638..57648,57717..57816,57892..58015,
58139..58339,58437..58537))
/gene="T6A9.2"
/note="Hypothetical protein"
/codon_start=1
/protein_id="AAG00884.1"
/db_xref="GI:9857529"
/translation="MPPKRNFRKRSFEEEEEDNDVNKAAISEEEEKRRLALEEVKFLQ
KLRERKLGIPALSSTAQSSIGKVKPVEKTETEGEKEELVLQDTFAQETAVLIEDPNMV
KYIEQELAKKRGRNIDDAEEVENELKRVEDELYKIPDHLKVKKRSSEESSTQWTTGIA
EVQLPIEYEYILNHRYTS"
gene complement(58995..59892)
/gene="T6A9.3"
CDS complement(join(58995..59539,59778..59892))
/gene="T6A9.3"
/note="Similar to germin proteins; Similar to germin,
oxalate oxidase proteins"
/codon_start=1
/protein_id="AAG00885.1"
/db_xref="GI:9857530"
/translation="MMNSRISIIIALSCIMITSIRAYDPDALQDLCVADKSHGTKLNG
FPCKETLNITESDFFFAGISKPAVINSTMGSAVTGANVEKIPGLNTLSVSLARIDYAP
GGLNPPHTHPRATEWYVLEGELEVGFITTT^NKLFTKTIKIGEVFVFPRGLVHFQKNN
GKSPASVLSAFNSQLPGTASVAATLFAAEPALPEDVLTKTFQVGSKMVDKIKERLATK
K"

gene complement(60812..62776)
/gene="T6A9.4"
mRNA complement(join(60812..61263,61748..61813,61970..62102,
62202..62568,62700..62776))
/gene="T6A9.4"
CDS complement(join(61091..61263,61748..61813,61970..62102,
62202..62568,62700..62701))
/gene="T6A9.4"
/note="Unknown protein"
/codon_start=1
/protein_id="AAG00886.1"
/db_xref="GI:9857531"
/translation="MSNNQAFMELGWRNDVGSLAVKDQGMMSERARSDEDRLINGLKW
GYGYFDHDQTDNYLQIVPEIHKEVENAKEDLLVWPDEHSETDDHHHIKDFSERSDHR 
FYLRNKHENPKKRRIQVLSSDDESEEFTREVPSVTRKGSKRRRRDEKMSNKMRKLQQL 
VPNCHKVDGQGFGSRQDHRVYEKPSTSTSDDVNSGGESLFSSGDIRIWNAQPHADGNG 
FGSRPKSGESHDAIAANSGVKLAITTVY"
gene complement(67154..68132)
/gene="T6A9.5"
CDS complement(join(67154..67523,67684..68132))
/gene="T6A9.5"
/codon_start=1
/product="Putative chitinase"
/protein_id="AAG00887.1"
/db_xref="GI:9857532"
/translation="MAQQHSFLLLCFFLSISYLLSSAQTEATSIERLVPRDLYNKIFI
HKDNTACPANGFYTYESFVQATRRFPRFGSVGSPVTQRLEVAAFLAQISHETTGGWAT 
APDGPYAWGLCFKEEVSPQSTYCDSSDTQWPCFPNKTYQGRGPIQLSWNYNYGPAGRA 
LGFDGLRNPETVSNNSVIAFQTALWFWMTPQSPKPSCHDVMIGKYRPTAADLAANRTG 
GFGLTTNIINGGLECGIPGDGRVNDRIGFFQRYTGLFKVATGPNLDCENQRPYA"
gene 69532..71399
/gene="T6A9.6"
CDS join(69532..69903,70158..71399)
/gene="T6A9.6"
/note="Hypothetical protein"
/codon_start=1
/protein_id="AAG00888.1"
/db_xref="GI:9857533"
/translation="MNFRNLIASGSRLGKRFCATVFAPASATGIVEASVSSPAAANW
EASVSSPAAENGVRTSVAAPTVASRQRELYKKLSMLSVTGGTVAETLNQFIMEGITVR 
KDDLFRCAKTLRKFRRPQHAFEIFDWMEKRKMTFSVSDHAICLDLIGKTKGLEAAENY 
FNNLDPSAKNHQSTYGALMNCYCVELEEEKAKAHFEIMDELNFVNNSLPFNNMMSMYM 
RLSQPEKVPVLVDAMKQRGISPCGVTYSIWMQSCGSLNDLDGLEKIIDEMGKDSEAKT 
TWNTFSNLAAIYTKAGLYEKADSALKSMEEKMNPNNRDSHHFLMSLYAGISKGPEVYR 
VWESLKKARPEVNNLSYLVMLQAMSKLGDLDGIKKIFTEWESKCWAYDMRLANIAINT
YLKGNMYEEAEKILDGAMKKSKGPFSKARQLLMIHLLENDKADLAMKHLEAAVSDSAE
NKDEWGWSSELVSLFFLHFEKAKDVDGAEDFCKILSNWKPLDSETMTFLIKTYAAAEK
TSPDMRERLSQQQIEVSEEIQDLLKTVCP"
gene 72830..73866
/gene="T6A9.7"
CDS join(72830..73319,73541..73685,73806..73866)
/gene="T6A9.7"
/note="Hypothetical protein"
/codon_start=1
/protein_id="AAG00889.1"
/db_xref="GI:9857534"
/translation="MDQMLSGEQDLEVDIEAGRSDVTQESTSDTVSGNGVWSERANFG
VSEKIADDLSYPLIRDENRVETSSQSLDLSEKKCGNGKFKKSRKASKPPRPPKGPSLS
ENDRKIMRDIQELAMRKRARIERMKKSLKRLKAAKTSPSSPCITIFSMIITAIFFAFL
VFQGFSTGSSSMNSDKSPAPTVSPNNQMISVQFYNDFAPVEQTDPSPTTSLRYTRKRI
SGAEEEDSRDVTR"
gene 75937..78179
/gene="T6A9.8"
CDS join(75937..76635,77286..78179)
/gene="T6A9.8"
/note="Unknown protein"
/codon_start=1
/protein_id="AAG00890.1"
/db_xref="GI:9857535"
/translation="MSGNKISTLQALVFFLYRFFILRRWCHRSPKQKYQKCPSHGLHQ
YQDLSNHTLIFNVEGALLKSNSLFPYFMWAFEAGGVIRSLFLLVLYPFISLMSYEMG 
LKTMVMLSFFGVKKESFRVGKSVLPKYFLEDVGLEMFQVLKRGGKRVAVSDLPQVMID 
VFLRDYLEIEVWGRDMKMVGGYYLGIVEDKKNLEIAFDKWQEERLGSGRRLIGITS 
FNSPSHRSLFSQFCQEIYFVRNSDKKSWQTLPQDQYPKPLIFHDGRLAVKPTPLNTLV 
LFMWAPFAAVLAAARLVFGLNLPYSLANPFLAFSGIHLTLTVNNHNDLISADRKRGCL 
FVCNHRTLLDPLYISYALRKKNMKAVTYSLSRLSELLAPIKTVRLTRDRVKDGQAMEK 
LLSQGDLWCPEGTTCREPYLLRFSPLFSEVCDVIVPVAIDSHVTFFYGTTASGLKAF 
DPIFFLLNPFPSYTVKLLDPVSGSSSSTCRGVPDNGKVNFEVANHVQHEIGNALGFEC 
TNLTRRDKYLILAGNNGWKKK"
gene 81980..84407
/gene="T6A9.9"
CDS join(81980..82380,83736..84078,84162..84407)
/gene="T6A9.9"
/note="Unknown protein"
/codon_start=1
/protein_id="AAG00891.1"
/db_xref="GI:9857536"
/translation="MVLPSSTPLQTTGKKTISSPEYNFPVIDFSLNDRSKLSEKIVKA
CEVNGFFKVINHGVKPEIIKRFEHEGEEFFNKPESDKLRAGPASPFGYGCKNIGFNGD 

Query Match 100.0%; Score 1033; DB 4; Length 109180; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1033; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy          1 GAATTCGTGGTATAGCGTTACTTAATAACAATTATAAACTGTAAAATATAAATATTTTAT 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      94529 GAATTCGTGGTATAGCGTTACTTAATAACAATTATAAACTGTAAAATATAAATATTTTAT 94470

Qy         61 AAAAATAAAATTTGCAAGTTTTAATATATATTATTTTTAAAAATAAATCGTCCCGCGATA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      94469 AAAAATAAAATTTGCAAGTTTTAATATATATTATTTTTAAAAATAAATCGTCCCGCGATA 94410

Qy        121 TACCGCGGGTTAAAATCTAGTTTCTTTTTGTTTATGTAACATCAATAGAGGTAATCTAAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      94409 TACCGCGGGTTAAAATCTAGTTTCTTTTTGTTTATGTAACATCAATAGAGGTAATCTAAT 94350

Qy        181 TACCATTCATTTAAAATACCAAAAATAGGGGAAAAAATGTTCTTCGTTGGAATCCAATTG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      94349 TACCATTCATTTAAAATACCAAAAATAGGGGAAAAAATGTTCTTCGTTGGAATCCAATTG 94290



